Semantic Evidence for Automatic Identification of Cognates
Authors
Abstract
The identification of cognate word pairs has recently started to attract attention in NLP research, but the area remains largely unexplored. This paper builds on a purely orthographic approach to the task by introducing semantic evidence, drawn from monolingual thesauri and corpora, to support the identification process. The proposed method is easily portable between languages and specialisation domains: it does not depend on parallel texts or extensive knowledge resources, requiring only monolingual corpora and a bilingual dictionary encoding correspondences between only the core vocabularies of both languages. Our evaluation of the method on four language pairs suggests that introducing semantic evidence substantially increases the precision of cognate identification.
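The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the orthographic score is the longest common subsequence ratio (LCSR), that semantic evidence is a set of related words per term (as a thesaurus or corpus might supply), and that a small seed dictionary links the core vocabularies. All function names, thresholds, and toy data are hypothetical.

```python
from typing import Dict, Set


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def orthographic_similarity(a: str, b: str) -> float:
    """LCSR: LCS length divided by the longer word's length (assumed measure)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def semantic_overlap(
    related_src: Set[str],
    related_tgt: Set[str],
    seed_dict: Dict[str, Set[str]],
) -> float:
    """Share of translatable source-side related words whose seed
    translations appear among the target-side related words."""
    translatable = [w for w in related_src if w in seed_dict]
    if not translatable:
        return 0.0
    hits = sum(1 for w in translatable if seed_dict[w] & related_tgt)
    return hits / len(translatable)


def is_cognate_candidate(
    src: str,
    tgt: str,
    related_src: Set[str],
    related_tgt: Set[str],
    seed_dict: Dict[str, Set[str]],
    ortho_min: float = 0.6,  # hypothetical threshold
    sem_min: float = 0.5,    # hypothetical threshold
) -> bool:
    """Accept a pair only if it passes both the orthographic and semantic checks."""
    if orthographic_similarity(src, tgt) < ortho_min:
        return False
    return semantic_overlap(related_src, related_tgt, seed_dict) >= sem_min


# Toy English-French data (illustrative only, not from the paper).
SEED = {
    "country": {"pays"}, "state": {"état"}, "people": {"peuple"},
    "book": {"livre"}, "reading": {"lecture"}, "loan": {"prêt"},
}
NATION_EN = {"country", "state", "people"}
NATION_FR = {"pays", "état", "peuple"}
LIBRARY_EN = {"book", "reading", "loan"}
LIBRAIRIE_FR = {"livre", "vente", "boutique"}
```

With this toy data, the pair "nation"/"nation" passes both checks, while "library"/"librairie" is orthographically similar but is rejected by the semantic check. Filtering out such false friends is the kind of precision gain the abstract attributes to semantic evidence.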
Similar resources
Multilingual lexical resources to detect cognates in non-aligned texts
The identification of cognates between two distinct languages has recently started to attract the attention of NLP research, but there has been little research into using semantic evidence to detect cognates. The approach presented in this paper aims to detect English-French cognates within monolingual texts (texts that are not accompanied by aligned translated equivalents), by integrating word...
A Persian-English Cross-Linguistic Dataset for Research on the Visual Processing of Cognates and Noncognates
Finding out which lexico-semantic features of cognates are critical in cross-language studies and comparing these features with noncognates helps researchers to decide which features to control in studies with cognates. Normative databases provide necessary information for this purpose. Such resources are lacking in the Persian language. We created a dataset and determined norms for the essenti...
Disruption in the Semantic Network of Schizophrenic Patients: Semantic Priming with Simultaneous Presentation of Two Primes
Abstract Objectives: The present study was designed to investigate the automatic activation of semantic priming in schizophrenic patients. Method: 36 schizophrenic patients and 36 normal subjects participated in two experiments. In experiment one, the effect of semantic relation on identification of degraded targets was examined between a series of single prime words and single target words...
Automatic cognate identification with gap-weighted string subsequences
In this paper, we describe the problem of cognate identification in NLP. We introduce the idea of gap-weighted subsequences for discriminating cognates from non-cognates. We also propose a scheme to integrate phonetic features into the feature vectors for cognate identification. We show that subsequence-based features perform better than a state-of-the-art classifier for the purpose of cognate ide...
Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences, and attaching the correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
Journal title:
Volume, issue:
Pages: -
Publication date: 2007